Using a stochastic context-free grammar as a language model for speech recognition

Authors

  • Daniel Jurafsky
  • Chuck Wooters
  • Jonathan Segal
  • Andreas Stolcke
  • Eric Fosler-Lussier
  • Gary Tajchman
  • Nelson Morgan
Abstract

This paper describes a number of experiments in adding new grammatical knowledge to the Berkeley Restaurant Project (BeRP), our medium-vocabulary (1300 word), speaker-independent, spontaneous continuous-speech understanding system (Jurafsky et al. 1994). We describe an algorithm for using a probabilistic Earley parser and a stochastic context-free grammar (SCFG) to generate word transition probabilities at each frame for a Viterbi decoder. We show that using an SCFG as a language model improves word error rate from 34.6% (bigram) to 29.6% (SCFG), and semantic sentence recognition error from 39.0% (bigram) to 34.1% (SCFG). In addition, we get a further reduction to 28.8% word error by mixing the bigram and SCFG LMs. We also report on our preliminary results from using discourse-context information in the LM.
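The LM mixing mentioned above can be sketched as a linear interpolation of the two models' word transition probabilities at each frame. This is a generic illustration, not the paper's exact scheme: the interpolation weight `lam` and the renormalization over the candidate word set are assumptions.

```python
def mixed_transition_probs(bigram_probs, scfg_probs, lam=0.5):
    """Interpolate bigram and SCFG word transition probabilities.

    bigram_probs, scfg_probs: dicts mapping word -> P(word | history)
    under each LM. Returns a renormalized mixture distribution that a
    Viterbi decoder could use as its per-frame transition probabilities.
    """
    words = set(bigram_probs) | set(scfg_probs)
    mixed = {w: lam * scfg_probs.get(w, 0.0) + (1 - lam) * bigram_probs.get(w, 0.0)
             for w in words}
    total = sum(mixed.values())
    # Renormalize so the candidates form a proper distribution.
    return {w: p / total for w, p in mixed.items()} if total > 0 else mixed

probs = mixed_transition_probs({"the": 0.6, "a": 0.4},
                               {"the": 0.8, "a": 0.2}, lam=0.5)
```

With equal weights, each word's mixed probability is just the average of its two LM probabilities; in practice the weight would be tuned on held-out data.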


Similar articles

A language model combining trigrams and stochastic context-free grammars

We propose a class trigram language model in which each class is specified by a stochastic context-free grammar. We show how to estimate the parameters of the model, and how to smooth these estimates. We present experimental perplexity and speech recognition results.
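The smoothing of n-gram estimates mentioned above can be illustrated with deleted interpolation, a standard technique that backs a trigram estimate off toward bigram and unigram estimates. This is a generic sketch under assumed count dictionaries; the paper's actual smoothing scheme may differ, and the weights `lambdas` are illustrative.

```python
def smoothed_trigram_prob(w, history, counts3, counts2, counts1, total,
                          lambdas=(0.6, 0.3, 0.1)):
    """Deleted-interpolation estimate of P(w | history).

    history: (w_{i-2}, w_{i-1}); counts3/counts2/counts1: trigram,
    bigram, and unigram count dicts; total: corpus word count.
    """
    l3, l2, l1 = lambdas
    h2, h1 = history
    # Maximum-likelihood estimates at each order, guarding empty histories.
    p3 = counts3.get((h2, h1, w), 0) / counts2[(h2, h1)] if counts2.get((h2, h1)) else 0.0
    p2 = counts2.get((h1, w), 0) / counts1[h1] if counts1.get(h1) else 0.0
    p1 = counts1.get(w, 0) / total
    return l3 * p3 + l2 * p2 + l1 * p1
```

In practice the weights are estimated on held-out data (e.g. by EM) rather than fixed by hand.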


A tree-trellis n-best decoder for stochastic context-free grammars

In this paper a decoder for continuous speech recognition using stochastic context-free grammars is described. It forms the backbone of the ACE recognizer, which is a modular system for real-time speech recognition. A new rationale for automata is introduced, as well as a new model for pruning the search space.


Computation of the Probability of the Best Derivation of an Initial Substring from a Stochastic Context-Free Grammar

Recently, Stochastic Context-Free Grammars have been considered important for use in Language Modeling for Automatic Speech Recognition tasks [6, 10]. In [6], Jelinek and Lafferty presented and solved the problem of computation of the probability of initial substring generation by using Stochastic Context-Free Grammars. This paper seeks to apply a Viterbi scheme to achieve the computation of th...
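The best-derivation computation discussed above can be sketched for the simpler complete-string case: a Viterbi variant of CKY that returns the probability of the single best derivation of a string under a CNF SCFG (the paper extends this kind of computation to initial substrings). The grammar encoding below is an assumption for illustration.

```python
from collections import defaultdict

def viterbi_parse_prob(words, lexical, binary, start="S"):
    """Probability of the best derivation of `words` under a CNF SCFG.

    lexical: dict (A, word) -> P(A -> word)
    binary:  dict (A, B, C) -> P(A -> B C)
    Max-product dynamic program over spans (Viterbi CKY).
    """
    n = len(words)
    best = defaultdict(float)  # (i, j, A) -> max prob that A derives words[i:j]
    # Base case: preterminal rules over single words.
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                best[(i, i + 1, A)] = max(best[(i, i + 1, A)], p)
    # Recursive case: combine best subderivations over all split points.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    cand = p * best[(i, k, B)] * best[(k, j, C)]
                    if cand > best[(i, j, A)]:
                        best[(i, j, A)] = cand
    return best[(0, n, start)]
```

Replacing `max` with summation at each cell would instead compute the inside (total-derivation) probability, which is the quantity Jelinek and Lafferty generalize to prefixes.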


A Language Model Combining Trigrams and Context-Free Grammars

We present a class trigram language model in which each class is specified by a probabilistic context-free grammar. We show how to estimate the parameters of the model, and how to smooth these estimates. Experimental perplexity and speech recognition results are presented.


Studying impressive parameters on the performance of Persian probabilistic context free grammar parser

In linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of treebank data has been important ever since the first large-scale treebank, the Penn Treebank, was published. However, although treebanks originated in computational linguistics, their value is becoming more widely appreciated in linguistics research as a whole. F...



Publication date: 1995